3574 results found.
Written
Contextualized Embeddings,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution-Noncommercial (CC BY-NC) 4.0 Germany
Size:
150 MByte
Production Status:
Newly created-finished
Use:
Semantic Language Representation
-
Paper title: ProGene - A Large-scale, High-Quality Protein-Gene Annotated Benchmark Corpus
-
Paper track: Written/poster presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Erik Faessler | PubMed Gene Flair Embeddings | /N |
Documentation:
English documentation available in the download package.
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Apache License 2.0
Size:
866 coreference chains (Other)
Production Status:
Newly created-finished
Use:
Anaphora, Coreference
-
Paper title: A Study on Entity Resolution for Email Conversations
-
Paper track: Evaluation/oral presentation
-
Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Parag Pravin Dakle | Seed corpus for entity coreference in email conversations | /N |
Documentation:
Basic documentation has been provided in English. Documentation will be released with the corpus.
Written
Corpus Tool,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
OpenSource
Size:
23 KByte
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
-
Paper title: WIKIR: A Python Toolkit for Building a Large-scale Wikipedia-based English Information Retrieval Dataset
-
Paper track: Written/oral presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Jibril Frej | wikIR | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons Attribution 4.0 International
Size:
56622 entries
Production Status:
Newly created-finished
Use:
Machine Learning
-
Paper title: MAGPIE: A Large Corpus of Potentially Idiomatic Expressions
-
Paper track: Written/oral presentation
-
Paper status: Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hessel Haagsma | MAGPIE Corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
CreativeCommons
Size:
66.3 KByte
Production Status:
Newly created-finished
Use:
Emotion Recognition/Generation
-
Paper title: CEASE, a Corpus of Emotion Annotated Suicide notes in English
-
Paper track: Written/oral presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Soumitra Ghosh | CEASE | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
CreativeCommons
Size:
49 topics X 10 summaries and 11 dimensions (Other)
Production Status:
Newly created-finished
Use:
Summarisation
-
Paper title: A Data Set for the Analysis of Text Quality Dimensions in Summarization Evaluation
-
Paper track: Evaluation/poster presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Eneldo Loza Mencía | DIP-SumEval | /N |
Documentation:
English documentation
Written
Evaluation Data,
Language Type:
Monolingual
Languages:
English
Availability:
Freely Available
License:
Creative Commons CC BY-SA 4.0
Size:
150,000 questions (Other)
Production Status:
Existing-used
Use:
Question Answering
-
Paper title: Chat or Learn: a Data-Driven Robust Question-Answering System
-
Paper track: Evaluation/poster presentation with demo
-
Paper status: Accept Poster+Demo
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Andrei Popescu-Belis | Stanford Question Answering Dataset (SQuAD) | /N |
Documentation:
Same URL, articles published
Written
Corpus,
Language Type:
Monolingual
Languages:
English
Availability:
From Owner
License:
Size:
None
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
-
Paper title: Corpus for Modeling User Interactions in Online Persuasive Discussions
-
Paper track: Written/poster presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Ryo Egawa | SemEU-R CMV Corpus | /N |
Documentation:
None
Written
Corpus Tool,
Language Type:
Multilingual
Languages:
Dutch English Italian
Availability:
Freely Available
License:
CreativeCommons
Size:
100000 sentences
Production Status:
Newly created-in progress
Use:
Semantic Role Labeling
-
Paper title: Large-scale Cross-lingual Language Resources for Referencing and Framing
-
Paper track: Written/oral presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Piek Vossen | MWEP toolkit | /N |
Documentation:
English
Written
Corpus,
Language Type:
Bilingual
Languages:
English Japanese
Availability:
Freely Available
License:
Research purposes only
Size:
8763995 sentences
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title: JParaCrawl: A Large Scale Web-Based English-Japanese Parallel Corpus
-
Paper track: Written/poster presentation
-
Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Makoto Morishita | JParaCrawl | /N |
Documentation:
None




